As AI model size grows, neural scaling laws have become a crucial tool for predicting the improvements of large models when increasing capacity and the size of original (human or natural) training data. Yet the widespread use of popular models means that the ecosystem of online data and text will co-evolve to progressively contain increased amounts of synthesized data. In this paper we ask: How will the scaling laws change in the inevitable regime where synthetic data makes its way into the training corpus? Will future models still improve, or will they be doomed to degenerate, up to total (model) collapse? We develop a theoretical framework of model collapse through the lens of scaling laws. We discover a wide range of decay phenomena, analyzing loss of scaling, shifted scaling with the number of generations, the "un-learning" of skills, and grokking when mixing human and synthesized data. Our theory is validated by large-scale experiments with a transformer on an arithmetic task and text generation using the large language model Llama2.
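As a hedged, self-contained illustration of model collapse (a toy estimator, not the paper's transformer or Llama2 setup), repeatedly refitting a model on its own samples shows how fitted statistics drift once synthetic data dominates the training corpus:

```python
import numpy as np

# Toy illustration of model collapse (illustrative only, not the
# paper's framework): fit a Gaussian to data, then train the next
# "generation" solely on samples drawn from the previous fit.
# Estimation noise compounds across generations, so the fitted
# parameters drift from the original distribution; over long
# horizons the fitted spread collapses toward zero.
rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=100)  # original "human" data

for gen in range(20):
    mu, sigma = data.mean(), data.std(ddof=1)    # "train" generation gen
    print(f"gen {gen:2d}: mu={mu:+.3f}, sigma={sigma:.3f}")
    data = rng.normal(mu, sigma, size=100)       # fully synthetic corpus
```

Intuitively, mixing fresh human data back in at each generation anchors the fit and slows this drift, in line with the human/synthetic mixing regime the abstract analyzes.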
-
Hydrogels showing strong adhesion to different substrates have garnered significant attention for engineering applications. However, the development of such hydrogel-based adhesives is currently limited largely to synthetic polymers, owing to their exceptional performance and an extensive array of chemical options. To advance the development of sustainable hydrogel-based adhesives, we create a highly robust all-cellulose hydrogel-based adhesive, composed of concentrated dialcohol cellulose nanorods (DCNRs), that relies on enhanced hydrogen-bonding interactions between cellulose and the substrate. We achieve this high-performance all-cellulose hydrogel through a sequential oxidation-reduction process that converts the two secondary hydroxyl groups within an anhydroglucose unit into two primary hydroxyl groups while simultaneously linearizing the cellulose chains. As simulations indicate, these structural and chemical modifications increase out-of-plane interactions between the DCNR hydrogel and the substrate. They also enhance the flexibility of the cellulose chains, which would otherwise be rigid. The resulting all-cellulose hydrogels are injectable and adhere strongly to a wide range of substrates, including wood, metal, glass, and plastic. This green and sustainable all-cellulose hydrogel-based adhesive holds great promise for future bio-based adhesive design.
-
We consider information design in spatial resource competition, motivated by ride-sharing platforms sharing information with drivers about rider demand. Each of N co-located agents (drivers) decides whether to move to another location with an uncertain and possibly higher resource level (rider demand), where the utility for moving increases in the resource level and decreases in the number of other agents that move. A principal who can observe the resource level wishes to share this information in a way that ensures a welfare-maximizing number of agents move. Analyzing the principal's information design problem using the Bayesian persuasion framework, we study both private signaling mechanisms, where the principal sends personalized signals to each agent, and public signaling mechanisms, where the principal sends the same information to all agents. We show: 1) For private signaling, computing the optimal mechanism using the standard approach leads to a linear program with 2^N variables, rendering the computation challenging. We instead describe a computationally efficient two-step approach to finding the optimal private signaling mechanism. First, we perform a change of variables to solve a linear program with O(N^2) variables that provides the marginal probabilities of recommending that each agent move. Second, we describe an efficient sampling procedure over sets of agents consistent with these optimal marginal probabilities (sketched below); the optimal private mechanism then asks the sampled set of agents to move and the rest to stay. 2) For public signaling, we first show that the welfare-maximizing equilibrium given any common belief has a threshold structure. Using this, we show that the optimal public mechanism with respect to the sender-preferred equilibrium can be computed in polynomial time. 3) We support our analytical results with numerical computations showing that the optimal private and public signaling mechanisms achieve substantially higher social welfare than no-information and full-information benchmarks.
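For the second step of the private mechanism, one classical way to sample a set of agents whose inclusion probabilities match given marginals is systematic sampling. The sketch below assumes this scheme purely for illustration (the paper's own sampling procedure may differ), with hypothetical marginals p standing in for the output of the O(N^2) linear program:

```python
import numpy as np

def systematic_sample(p, rng):
    """Sample a set of agents whose inclusion marginals match p.

    Classical systematic sampling: each agent i is included with
    probability exactly p[i], and the set size is fixed whenever
    sum(p) is an integer. Illustrative stand-in for the paper's
    sampling procedure, which may differ.
    """
    c = np.cumsum(p)                      # cumulative marginals
    c0 = np.concatenate(([0.0], c[:-1]))  # interval left endpoints
    u = rng.uniform()                     # one shared random shift
    # agent i is selected iff a grid point u, u+1, u+2, ...
    # lands in the interval (c0[i], c[i]] of length p[i]
    return np.flatnonzero(np.floor(c - u) > np.floor(c0 - u))

rng = np.random.default_rng(1)
p = np.array([0.5, 0.25, 0.75, 0.5])      # hypothetical marginals (sum = 2)
print(systematic_sample(p, rng))          # always exactly 2 agents here
```

The sampled set is then asked to move and the rest to stay, realizing the optimal marginal recommendation probabilities exactly.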
-
Characterization of paste flow is important in ensuring rheological control during printing. The interactions between rheological characteristics and processing parameters are better studied through a combination of experimental and simulation tools. For fresh pastes and concrete, discrete element method (DEM)-based simulations are well suited to provide insights into the particle-scale processes occurring during extrusion-based printing and to relate them to the macro-scale response of the entire system. In this paper, we model the extrusion process of a plain ordinary Portland cement (OPC) paste using DEM and outline the methodology adopted to evaluate the linkage between particle-scale processes and the extrusion process. An analytical model for a frictional plastic material undergoing ram extrusion is also used in conjunction with the DEM model to arrive at the yield stresses and shaping stresses that enable an efficient extrusion process, as a function of the material microstructure.
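The abstract does not name its analytical ram-extrusion model; a common choice for frictional plastic pastes is the Benbow-Bridgwater relation, sketched below with illustrative (not paper-reported) parameter values:

```python
import math

def ram_extrusion_pressure(sigma0, alpha, tau0, beta, V, D0, D, L):
    """Benbow-Bridgwater ram-extrusion pressure (velocity-linear form).

    P = 2*(sigma0 + alpha*V)*ln(D0/D) + 4*(tau0 + beta*V)*(L/D)

    sigma0: bulk yield stress at the die entry [Pa]
    tau0:   wall shear yield stress in the die land [Pa]
    alpha, beta: velocity factors [Pa*s/m]; V: extrudate velocity [m/s]
    D0, D:  barrel and die diameters [m]; L: die-land length [m]
    """
    die_entry = 2.0 * (sigma0 + alpha * V) * math.log(D0 / D)
    die_land = 4.0 * (tau0 + beta * V) * (L / D)
    return die_entry + die_land

# Illustrative numbers only (not from the paper):
print(ram_extrusion_pressure(sigma0=20e3, alpha=1e5, tau0=5e3,
                             beta=5e4, V=0.01, D0=0.05, D=0.01, L=0.02))
```

The die-entry term captures the bulk yield (shaping) stress and the die-land term the wall friction, which is the split between yield stresses and shaping stresses the abstract refers to.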